Extracting Features for Verifying WordNet

Authors

  • Altangerel Chagnaa
  • Cheolyoung Ock
  • Hoseop Choe
Abstract

WordNet is a semantic lexicon for the English language, and many countries have been developing their own WordNets. Almost all of these WordNets are built manually and, unfortunately, are used in many knowledge-based applications without having been verified. In this paper we aim at clustering-based verification of a manually built lexical taxonomy, namely the Korean WordNet, U-WIN. For this purpose two kinds of clustering methods are used: a K-Means approach and an ICA-based approach. The ICA-based approach gives the better result and proves very effective at extracting features.

Similar resources

Introducing a method for extracting features from facial images based on applying transformations to features obtained from convolutional neural networks

In pattern recognition, features denote measurable characteristics of an observed phenomenon, and feature extraction is the procedure of measuring these characteristics. A set of features can be expressed as a feature vector, which is used as the input data of a system. An efficient feature extraction method can improve the performance of a machine learning system such as face recognit...

Syntax-based Concept Extraction for Question Answering Using SEMEX

The SEMEX tool for question answering is presented. Its architecture and its features for extracting from input text a network of concept nodes that index syntax-based logical forms are described. Methods are shown for decomposing questions into boolean combinations of question patterns and for using the concept network and logical forms together with WordNet for question answering. SEMEX's encour...

Automatic Evaluation of Wordnet Synonyms and Hypernyms

In recent times, wordnets have become indispensable resources for Natural Language Processing. However, the creation of wordnets is a time-consuming and manpower-intensive proposition. This fact has led to attempts at quickly fixing a wordnet using text repositories such as the web and certain corpora, and also by translating an existing wordnet into another language. However, the results of su...

HesNegar: Persian Sentiment WordNet

Awareness of others' opinions plays a crucial role in the decision-making process of everyone from ordinary customers to top-level executives of manufacturing companies and various organizations. Today, with the advent of Web 2.0 and the expansion of social networks, a vast number of texts related to people's opinions have been created. However, exploring the enormous amount of documents, various opi...

Machine Learning of Syntactic Attachment from Morphosyntactic and Semantic Co-occurrence Statistics

The paper presents a novel approach to extracting dependency information in morphologically rich languages using co-occurrence statistics based not only on lexical forms (as in previously described collocation-based methods), but also on morphosyntactic and wordnet-derived semantic properties of words. Statistics generated from a corpus annotated only at the morphosyntactic level are used as fe...

Publication date: 2007